Add optional RNNoise support to AudioBridge #3185

lminiero · 2023-03-17T15:32:04Z

This PR is an attempt to revive an effort initially contributed by @mirkobrankovic in #2260, by adding optional and configurable support for RNNoise to the AudioBridge plugin: the idea is basically to add a mechanism to perform noise reduction in audio rooms with the help of the RNNoise library.

This patch is quite different from Mirko's original contribution, though:

First of all, the original patch performed noise reduction on the resulting mix: my patch, instead, has separate denoising contexts for each participant, meaning that each participant may or may not be denoised (it can be changed dynamically). The main reason for that is to improve the end result, since if many speakers are noisy, adding their contributions to the mix sums noise too: denoising participants at the "source", instead, should help clean up the signal before it's added to the mix, at the expense of course of some more CPU usage due to the multiple denoisers.
Besides, Mirko's patch had a peculiar "packet skipping" feature, where you could tell the code to only denoise N packets out of M: not sure why it was done that way (the comments suggested the audio would be too robotic otherwise), but I didn't find any note related to that in the search I made for common practices when using RNNoise, so I didn't do that: in my patch, if denoising is enabled, you denoise all packets, so it's an on/off kind of thing.

Apart from that, I added ways to selectively enable/disable the feature. First of all, you can configure a room to enable denoising by default, by setting the denoise property to true: in that case, any participant that joins the room results in a denoiser instance created for them, unless they explicitly provide a denoise: false property when joining. A participant that joins a room where denoising is not enabled by default can instead create a denoiser by joining with a denoise: true property. Denoising can also be enabled/disabled dynamically: participants can do that for themselves via a configure request, while room owners can use the synchronous denoise_enable and denoise_disable requests instead, by specifying the room and the specific participant to impact.
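As a rough illustration of the requests described above, here is a minimal Python sketch that builds the message bodies. The `denoise` property and the `configure`, `denoise_enable` and `denoise_disable` request names come from this PR; the helper names are hypothetical, and the `id` field used to identify the participant is an assumption (borrowed from how other AudioBridge moderation requests address participants), so check the plugin documentation before relying on it.

```python
def join_request(room, denoise=None):
    """Body of a join request; omit 'denoise' to inherit the room default."""
    body = {"request": "join", "room": room}
    if denoise is not None:
        body["denoise"] = denoise
    return body


def configure_request(denoise):
    """Body a participant sends to toggle their own denoiser."""
    return {"request": "configure", "denoise": denoise}


def owner_denoise_request(room, participant_id, enable):
    """Body a room owner sends to toggle denoising for one participant."""
    return {
        "request": "denoise_enable" if enable else "denoise_disable",
        "room": room,
        "id": participant_id,  # assumed field name, not confirmed in this thread
    }


def effective_denoise(room_default, join_value=None):
    """An explicit value at join time overrides the room default."""
    return room_default if join_value is None else join_value
```

For example, `effective_denoise(True, False)` models a participant opting out of denoising in a room where it is enabled by default.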

Coming to the actual implementation, it only partly works, due to some constraints in the RNNoise library that I haven't overcome entirely. Specifically, the rnnoise_process_frame function expects a buffer of exactly 480 samples to denoise: it can't be more and it can't be less. Depending on the sampling rate in use in the room, a single audio packet of 20ms received via WebRTC can contain a different number of samples: when using 16000 as a sampling rate, for instance, it will be 320 samples; 480 for 24000; 960 for 48000. This means that, at the moment, denoising works fine when you use a sampling rate of 24000 or 48000 (since we can do one or two rounds of denoising with samples that are multiples of 480), while you get audio artifacts when using lower sampling rates instead. At the moment, I'm also getting artifacts when using stereo rooms (spatial audio): I do know that Opus uses interleaving when stereo is used, but even taking that into account (and so performing multiple rounds of denoising on different subsets of 480 samples) the artifacts are still there.
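The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the 480-sample constraint, not the code in this PR: the function names are hypothetical, and carrying leftover samples over to the next packet is just one possible way to handle packets that are not a multiple of 480.

```python
FRAME_SIZE = 480  # samples rnnoise_process_frame expects per call


def samples_per_packet(sampling_rate, packet_ms=20):
    """Number of mono samples in one packet of packet_ms milliseconds."""
    return sampling_rate * packet_ms // 1000


def split_into_frames(pending, packet):
    """Append a packet to the leftover samples and cut full 480-sample frames.

    Returns (frames, leftover): frames can be denoised right away, leftover
    has to wait for the next packet (hypothetical buffering strategy).
    """
    samples = pending + packet
    usable = len(samples) - len(samples) % FRAME_SIZE
    frames = [samples[i:i + FRAME_SIZE] for i in range(0, usable, FRAME_SIZE)]
    return frames, samples[usable:]


def deinterleave(stereo):
    """Split interleaved L/R samples into two mono channels."""
    return stereo[0::2], stereo[1::2]
```

With these helpers, `samples_per_packet(16000)` is 320, so a first 16000 Hz packet yields no full frame and leaves 320 samples pending, while a 48000 Hz packet (960 samples) yields exactly two frames with nothing left over.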

For these reasons, I decided to submit this as a draft pull request: in fact, it kinda works already, but it also definitely needs improvements, so I'm hoping that some fresh pairs of eyes looking at the code and/or people testing and providing feedback may help address what's currently not 100% working.

Tagging as multistream since I implemented this in 1.x, but this will be trivial to port to 0.x as well should this be merged eventually.

lminiero · 2023-05-25T15:17:48Z

I spent some time investigating what was wrong, and managed to fix the broken audio and artifacts in all combinations of sampling rate, mono and stereo alike. It seems to be working fine to me, now (even disabling/enabling denoising on the fly, where you can actually hear it at work), so I marked the PR as ready for review. That said, I'll obviously wait for more feedback before (if) we merge this, also taking into account the jitter buffer PR would need to be merged first (which means I'll have to rebase this accordingly).

IlgamGabdullin · 2023-08-21T15:20:43Z

Hi! Is there a chance that this feature will be in VideoRoom Plugin?

lminiero · 2023-09-07T10:06:21Z

> Hi! Is there a chance that this feature will be in VideoRoom Plugin?

No, because the VideoRoom plugin only relays packets, it doesn't decode them as the AudioBridge does.

Odinvt · 2023-09-08T13:48:31Z

Hi! Thanks for this PR!

We were using a frontend version of the same RNNoise with the default model via WASM and it doesn't affect the original voice quality much. (https://github.com/jitsi/rnnoise-wasm)

But, when testing this PR it results in a very "strangled" voice quality compared to the WASM one.

Would you say that this is related to the fact that it's used on the original lossless microphone samples before encoding frontend side, or rather something related to the usage of RNNoise in this PR ?

PS: I could provide 3 simultaneous audio recordings of the original, denoised-frontend, denoised-janus sound in a noisy environment if it helps

Thanks again !

lminiero · 2023-09-08T13:52:34Z

> Would you say that this is related to the fact that it's used on the original lossless microphone samples before encoding frontend side, or rather something related to the usage of RNNoise in this PR ?

I think it depends on the sampling rate you're using in the AudioBridge. If you use, e.g., 16000, resampling will be performed, and I think it may be that the way we do it causes what you hear. If in a 48000 room the audio is good, and comparable in your tests, that's likely it.

Odinvt · 2023-09-08T15:26:35Z

Indeed, I retested with 24000/48000 room sampling rates and it works perfectly fine. I'll be moving this to production tonight for some 500 daily users (some OPUS/48000, some PCMA/8000 (SIP & PSTN bridged from the SIP plugin to the AudioBridge plugin)). I'll let you guys know how it works out for quality & CPU usage. Thanks again, great work 🙏

Odinvt · 2023-09-11T14:29:52Z

We've seen no significant CPU usage increase with this enabled in production (2 to 4 percentage points increase in average CPU usage across multiple 32 CPU VMs). Although, it does seem to add some "barely noticeable" cumulative delay in audio contributions with denoise enabled (just under the 200ms mark) for hour long conferences.

atoppi · 2023-09-12T10:27:56Z

@Odinvt are you sure the observed delay is due to the denoiser?
We recently merged the jitter buffer in the master branch (now merged in this PR) and that might also explain a bit of latency.
To double-check this, use the Admin API for a handle experiencing the delay and read the values of buffer-in and queue-in.

Odinvt · 2023-09-12T10:32:16Z

@atoppi Indeed I was on v1.1.4 stable before. I only switched to v1.2.0 because this merge was rebased on top of speexdsp's jitter buffer. So I never got to actually test v1.2.0 separately. I will pay attention to the mentioned values for both denoise enabled and disabled, and get back to you asap. Thanks !

spscream · 2023-10-25T15:11:03Z

Hi, we are testing your PR. Denoise is working when initially joined with denoise: true. But changes on the fly with new configure request aren't working:

{"body":{"denoise":false,"request":"configure"},"handle_id":8708560238099099,"janus":"message","session_id":6601772272643216,"transaction":"QUrR+WT5jq4"}
{"body":{"denoise":true,"request":"configure"},"handle_id":8708560238099099,"janus":"message","session_id":6601772272643216,"transaction":"+6ZA40wJa3g"}

The requests above don't change anything on the fly. Is anything wrong with them?
I also tried with an ICE restart, sending a new JSEP and the denoise flag, but it has no effect.

brave44 · 2023-10-27T09:10:47Z

@lminiero any plans to merge this?

lminiero · 2023-10-27T09:20:52Z

> The requests above don't change anything on the fly

@spscream Does it work when using the synchronous denoise_enable and denoise_disable requests instead?

@brave44 we do plan to merge this, but not sure yet, as we're still waiting for more feedback; besides, we may have to improve how it works when doing 8000/16000 sampling rates.

brave44 · 2023-10-27T09:34:37Z

> @brave44 we do plan to merge this, but not sure yet, as we're still waiting for more feedback; besides, we may have to improve how it works when doing 8000/16000 sampling rates.

Thanks, yeah, for us it works fine with a 48000 sampling rate.

atoppi · 2023-11-16T18:16:32Z

@Odinvt @spscream @brave44 the algorithm has been totally refactored, could you please test the last revision?

Odinvt · 2023-11-16T21:42:15Z

> @Odinvt are you sure the observed delay is due to the denoiser? We recently merged the jitter buffer in the master branch (now merged in this PR) and that might also explain a bit of latency. To double-check this, use the Admin API for a handle experiencing the delay and read the values of buffer-in and queue-in.

I apologize for the late reply. The audio delay was due to the forwards of our "tree" scaling going through a VXLAN over a VPN whose encryption was not hardware accelerated.
We've been running this branch with the jitter buffer rebase since September, with no issues at 24000/48000 sampling rates in production after successful stress tests.

> The requests above don't change anything on the fly

> @spscream Does it work when using the synchronous denoise_enable and denoise_disable requests instead?

@lminiero the synchronous requests work fine; we've been using them in production too.

> @Odinvt @spscream @brave44 the algorithm has been totally refactored, could you please test the last revision?

@atoppi Will do. Thanks !

spscream · 2023-11-17T07:54:39Z

@atoppi thanks, we will check it

lminiero · 2023-11-20T11:56:28Z

Can't help on forks, sorry. If you can find if any commit we made is causing the problem, please do let us know.

spscream · 2023-11-20T12:03:33Z

@lminiero sure, we will do our best.
But our fork doesn't touch the AudioBridge: only the VideoRoom logic is affected (some custom fields added to the participants' state).

spscream · 2023-11-21T09:19:26Z

@lminiero we checked on master + our commit today and we have no troubles with it. We will check the latest changes on the current branch this afternoon.

spscream · 2023-11-21T10:24:45Z

@lminiero moved to latest changes from this branch, no problems now for us.
We will monitor it until tomorrow, looks like everything is fine.

spscream · 2023-11-27T14:19:22Z

We're having some trouble with the AudioBridge. If a publisher has a bad network connection we hear crackling sounds, and this happens on every call with such clients. After change ec52fbb it's a bit better, but the issue isn't completely gone. How can we debug it and help address it?

By the way, with VideoRoom publishers from the same users we don't get these crackles.

spscream · 2023-11-27T15:25:48Z

I tested it using the clumsy utility https://jagt.github.io/clumsy/ - the crackling starts after adding drops over 5%

lminiero · 2023-11-27T15:30:10Z

Unless these reports are related to the RNNoise effort, this is the wrong place to talk about it.

spscream · 2023-11-27T15:38:11Z

> Unless these reports are related to the RNNoise effort, this is the wrong place to talk about it.

I created issue #3297 to address it.

mail2mhossain · 2024-02-05T02:25:57Z

We are currently testing your PR. As part of our testing process, we're setting up an audio bridge room with the RNNoise feature activated, by setting the "denoise" option to true.

In certain circumstances, we've noticed an echo effect. This typically happens when using the laptop's built-in speaker and microphone. Interestingly, this issue does not occur when headphones are utilized.

atoppi · 2024-02-05T10:31:44Z

@mail2mhossain that sounds more like an issue with the client environment.

> This typically happens when using the laptop's built-in speaker and microphone. Interestingly, this issue does not occur when headphones are utilized

Do your clients have echo cancellation mechanisms?
Typically browsers do and it should be enabled by default if the device supports it.
A quick one-liner test in a browser is the following:

(await navigator.mediaDevices.getUserMedia({ audio: true})).getAudioTracks()[0].getSettings().echoCancellation

lminiero · 2024-02-05T10:37:08Z

More importantly, is this indeed a problem with the RNNoise integration? Meaning, does it happen when RNNoise is enabled in the AudioBridge, but not when it's disabled? If not, it's irrelevant to this PR and, as Alessandro said, a client side problem (just wear a headset).

mail2mhossain · 2024-02-06T02:56:35Z

It appears that the issue we're encountering, specifically the echo experienced when using the laptop's built-in speaker and microphone, may indeed be related to the absence of RNNoise integration. We are considering integrating RNNoise as a potential solution to address this challenge.

The issue occurs both when RNNoise is activated in the AudioBridge and when it's not integrated with the AudioBridge.

Our application utilizes a desktop client that incorporates the Sipsorcery WebRTC library, and for audio functionalities, we're leveraging the SIPSorceryMedia.SDL2 library. Based on your feedback, it appears that echo cancellation features are expected to be handled by the client library, in this case, SIPSorceryMedia.SDL2. Consequently, I'm understanding that RNNoise might not support this specific type of noise cancellation, correct?

mail2mhossain · 2024-02-13T02:27:09Z

Could RNNoise effectively resolve the echo issue we're facing with the laptop's built-in speaker and microphone? Or, would it be more advisable to incorporate a noise reduction library on the client side?

atoppi · 2024-02-15T10:21:50Z

> Could RNNoise effectively resolve the echo issue we're facing with the laptop's built-in speaker and microphone? Or, would it be more advisable to incorporate a noise reduction library on the client side?

Noise suppression and echo suppression are very different processes.

Noise suppression subtracts from a generic stream what is recognized as background noise, and can be performed either in the media server (the AudioBridge with RNNoise in this case) or in the endpoint.
Echo suppression, on the other hand, is a process happening on an endpoint that subtracts from the microphone stream the echo bouncing back from the loudspeakers.

lminiero · 2024-04-08T16:13:52Z

Merging.

Add optional RNNoise support to AudioBridge

21c1c33

lminiero added the multistream Related to Janus 1.x label Mar 17, 2023

lminiero added 4 commits May 24, 2023 16:22

Aligned with latest changes in master

1302a29

Use two separate denoisers, when dealing with stereo participants

f52afc3

Fixed denoising artifacts when using mono 8000/16000

3c30a17

Fixed broken audio when denoising stereo participants

a594c53

lminiero marked this pull request as ready for review May 25, 2023 15:16

lminiero added 3 commits May 26, 2023 11:30

Move denoising code to a separate function

4060993

Aligned with latest changes

ddb216d

Aligned with new version of the AudioBridge plugin

c49061d

Odinvt mentioned this pull request Sep 12, 2023

T.140 support in SIP plugin, and WebRTC gateway (again! see #1898) #3231

Open

Refactor denoising algorithm

1feba68

atoppi added 2 commits November 16, 2023 20:45

Prefill with zeroes denoised buffers. Define FRAME_SIZE macro. Cast ptr to avoid warning.

8fc4e30

Tiny styling change in index expression

ebf2fbe

atoppi added 2 commits November 20, 2023 19:19

audiobridge: fix boolean setting for the first packet received

fe0b345

audiobrige: use fec only when received packet is expected plus one

ec52fbb

atoppi and others added 6 commits November 22, 2023 15:14

Use speex resampler for upsampling and downsampling

23386f4

Fixed denoise not tweakable via configure requests

812cc05

Do not destroy resamplers when hanging up

36e54a6

Create resamplers just once. Add missing macros.

2ca13b5

Create resamplers later when handling changeroom

975502c

Initialize denoiser stuff only in participant thread. Use quality=8 for resamplers. Always return the denoise flag in the admin response. Add missing docs.

5189caf

Aligned to latest changes in AudioBridge (suspend/resume)

06d9818

atoppi mentioned this pull request Feb 4, 2024

RNNoiseAudioBridge: we are testing this branch and getting echo #3329

Closed

lminiero mentioned this pull request Mar 28, 2024

Ship speexdsp's jitter buffer as part of local AudioBridge dependencies #3348

Merged

Aligned to latest changes, and local speexdsp dependency

096e945

lminiero merged commit 2490567 into master Apr 8, 2024
8 checks passed

lminiero deleted the rnnoise-audiobridge branch April 8, 2024 16:14

lminiero mentioned this pull request Apr 8, 2024

Add optional RNNoise support to AudioBridge (0.x) #3357

Merged
